ABSTRACT



This report discusses how technical and technological 
advances in alternative and augmentative communication (AAC) have outstripped 
the ability to assess their impact on actual communication and argues that 
this is due in part to the lack of a consistent and reliable method to 
measure long-term communicative efficacy. The report proposes a universal 
data logging format for AAC to allow researchers and clinicians to maximize 
communication rate through an analysis of error types and machine latency 
patterns and to facilitate comparisons among different AAC approaches by 
quantifying cross-interface variations in production efficiency. The log file 
is structured such that only those parameters appropriate to a particular 
situation need be recorded. The log file consists of three parts: a header 
that specifies the content and format of the individual log file entries, a 
body consisting of an arbitrary number of newline-separated log file 
entries, and an optional analysis section containing device-generated 
statistics on logged data. Examples of the fields that could be contained in 
the log file are provided. The report closes with a discussion of how this 
new standard, when combined with the Augmentative Communication Quantitative 
Analysis package written for Microsoft Windows, promises to open new 
possibilities for the quantitative assessment of AAC technologies. (CR) 
INTRODUCTION



Over the past few years, technical and technological advances in augmentative communication have 
outstripped our ability to assess the impact of these advances on the actual act of communication. This is 
due in part to the lack of a consistent and reliable method to measure long-term communicative efficacy. 
It has been extremely difficult for researchers, clinicians, and manufacturers to perform the kind of 
quantitative empirical studies that are an essential counterpart to theoretical advances and qualitative 
evaluations. Without a disciplined quantitative analysis, it is hard to identify and correct problems in a 
communication interface. Although customized data logging and analysis tools have been developed for 
specific investigations (Horstmann Koester & Levine, 1994, 1996; Lesher, Moulton, & Higginbotham, 
1998), this inefficient case-by-case approach is impractical for most of the AAC community. 



The future success of technical advances in AAC will depend increasingly on complex analyses of 
user-machine interactions. A comprehensive and universal format for the automatic logging of 
communication would make such analyses possible. Improvements in human-machine interactions will 
require detailed and reliable data collection procedures that can be accomplished on all devices (Miller, 
Demasco, & Elkins, 1990). We are therefore proposing a new standard for data logging in augmentative 
communication. Such a format will allow researchers and clinicians to maximize communication rate 
through an analysis of error types and machine latency patterns. Similarly, it will facilitate comparisons 
between different AAC approaches by quantifying cross-interface variations in production efficiency. 

The recording of AAC data has received a considerable amount of focus in the last year. Hill and Romich 
(1999a, 1999b; Romich & Hill, 1999) have been active proponents of a new standard format for the 
automatic recording of communication data. They have developed a Language Activity Monitor (LAM) 
that collects character output data from the serial port of dedicated communication devices. This data can 
later be uploaded to a computer for analysis. The LAM will allow clinicians and researchers to collect an 
unprecedented amount of data from augmented communicators using dedicated hardware devices such as 
Prentke Romich’s Liberator or a Dynavox system. However, Hill and Romich (1999b) have proposed 
making the LAM’s data format a logfile standard for all communication devices. We believe that this 
format is neither flexible nor powerful enough to serve as a general-purpose standard. Computer-based 
AAC devices offer a much broader range of logging possibilities than the dedicated systems for which the 
LAM was designed. 

Since the LAM records data from the serial port, all it can store is a time stamp (generated by the LAM 
itself) and the characters output by the communication device - this is the only data that AAC devices 
generally make available through the serial port. For many augmentative paradigms, however, the 
character output is the result of a series of intermediate steps. For example, in a Minspeak environment a 
sequence of symbols must be selected before there is any message production. Similarly, in an interface 
utilizing a page-based hierarchy there may be several page navigation commands prior to message 
production. 

In addition to skipping over intermediate message production steps, the LAM format does not provide 
explicit information about the source of the text output. A word appearing in the LAM file might have 
been produced by a Minspeak sequence, a single-key word selection, a dynamic word list selection, or an 
abbreviation expansion. Higginbotham, Lesher, and Moulton (1999) have identified several types of AAC 
investigations that would be impossible without more detailed logging information than the LAM format 
can provide. 

Under the auspices of the Rehabilitation Engineering Research Center on Communication Enhancement (the 
AAC-RERC, sponsored by the National Institute on Disability and Rehabilitation Research), we are 
defining a general-purpose logging standard for augmentative communication. Since the LAM represents 
the only automated recording method for dedicated communication devices, it is imperative that its 
storage format be incorporated as a subset of the proposed logfile standard. Additionally, we are 
constructing a software tool for the analysis of logfiles complying with the proposed standard. When 
completed, this tool will be freely distributed via the Internet. 

A UNIVERSAL LOGFILE FORMAT 

The definition of a universal format for AAC logging is complicated by the fact that the resulting logfiles 
will not have a single, specific use. Academic researchers, clinicians, educators, manufacturers, and 
end-users will utilize logfiles for different purposes and will therefore have widely varying data logging 
requirements. One possible solution to this quandary is to record every parameter that could 
conceivably be interesting. Besides being extremely inefficient, such an effort is certain to fail - there are 
simply too many variables of interest in augmentative communication to comprehensively catalog them 
all. 
To meet the varied demands of the AAC community, we propose a flexible logfile format that is powerful 
enough to support the most common data collection requirements while also providing an extendable 
framework for customized logging needs. The logfile is structured such that only those parameters 
appropriate to a particular situation (communication paradigm, AAC device, specific user, etc.) need be 
recorded. A file header specifies exactly what information will appear in the individual logfile entries, as 
well as how this information will be formatted. 

The proposed logfile consists of three basic parts: 

• A header that specifies the content and format of the individual logfile entries, 

• a body consisting of an arbitrary number of newline-separated logfile entries, and 

• an optional analysis section containing device-generated statistics on logged data. 

In addition, comments (preceded by a #) and blank lines may be positioned anywhere within the logfile. 
There are no size constraints on any part of the logfile. The file is currently limited to ASCII characters, 
although if there is significant interest the format may be extended to support Unicode (two-byte) 
characters. 

The header contains a formalized description of each field that appears in the individual logfile entries. An 
entry may consist of an arbitrary number (and ordering) of fields. The header might specify, for example, 
that each entry consists of a timestamp, followed by an indication of what kind of action triggered the 
selection, followed by the text output associated with the selection. In the body of the logfile, these 
parameters would appear separated by spaces or tabs within each entry. Besides specifying the order and 
type of the entry fields, additional field-specific details can be defined in the header. For example, the 
resolution of the timestamp can be established. 

Optionally, the header may be completely omitted. In this case, individual entries must consist of a 
timestamp followed by a text output (delimited by quotes). Since this is exactly the structure of a LAM 
record, this format is consistent with our proposed format. We are also investigating the possibility of 
allowing free-form entries from which the structure of the entries can be inferred without requiring 
explicit header information. If a header is present in the logfile, its end is indicated by a marker sequence 
($$$). 
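
As a minimal sketch of the headerless case described above, the following Python function reads one entry consisting of a timestamp followed by quoted text output. The regular expression, the function name, and the treatment of comment lines are assumptions made for illustration; this paper does not give the exact grammar of a LAM record.

```python
import re

# Assumed shape of a headerless entry: a whitespace-free timestamp, then
# the text output in double quotes. Illustrative only; the exact LAM
# grammar is not specified in this paper.
_ENTRY = re.compile(r'^(\S+)\s+"(.*)"\s*$')


def parse_headerless_entry(line):
    """Return (timestamp, text), or None for a comment or blank line."""
    stripped = line.strip()
    # Comments (lines starting with '#') and blank lines may appear anywhere.
    if not stripped or stripped.startswith("#"):
        return None
    match = _ENTRY.match(stripped)
    if match is None:
        raise ValueError(f"not a timestamp/quoted-text entry: {line!r}")
    return match.group(1), match.group(2)
```

Because the text output is quoted, trailing spaces inside the quotes (for example the space after a completed word) survive parsing.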

The fields that compose each logfile entry quantify unique aspects of the selection process that produced 
that entry. For many studies, the text output may be the only aspect of interest. For other purposes, 
however, information such as the selection method or the source of the output may be important. We are 
in the process of identifying a set of fundamental parameters that can be used to quantify the 
communication process. A few instructive examples are provided below. 

• Time: A timestamp with support for varying resolution (down to hundredths of a second). The 
timestamp may be absolute, relative to the start of the logfile, or relative to the last entry. 

• Output: Text output (if any) associated with an entry, delimited by quotation marks. 

• Action: Type of user action that produced the logfile entry. For example, "keypress", "left mouse 
click", or "switch 2 closure". An action field may include additional information about the selection 
event, such as the specific key pressed or the position of the mouse at the time of a mouse button 
click. 

• Input: Type of input device used for the action that produced the logfile entry. For example, 
"touchscreen", "mouse", or "joystick". 

• Type: An indication of the type of selection that produced the logfile entry. For example, "Character 
key", "Word list", "Page navigation command", or "Speak sentence command". The type is useful 
for determining the source of an entry. 

• Context: The words and/or characters that immediately preceded the current entry. The context 
may be of a variable length. 

• Page: A descriptive name of the page from which the entry originated. 

The number of entries in the body of the logfile is limited only by the memory available to store the file. 
The end of the body is indicated by another marker sequence ($$$). 
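
The three-part structure and the field-based entries can be sketched in Python as follows. This is an illustration under stated assumptions, not the reference implementation: it assumes a header is present, that each marker sits at the start of its own line, and that the caller supplies the field order directly instead of parsing it from the header, whose formal syntax is not reproduced here. The names `split_logfile` and `parse_entry` are hypothetical.

```python
import shlex

MARKER = "$$$"  # marker sequence that ends the header and the body


def split_logfile(text):
    """Split logfile text into (header, body, analysis) lists of lines.

    Assumes a header is present, so the first marker ends the header
    and the second marker ends the body.
    """
    sections = [[]]
    for raw in text.splitlines():
        line = raw.strip()
        # Comments and blank lines may appear anywhere and are skipped.
        if not line or line.startswith("#"):
            continue
        if line.startswith(MARKER):
            sections.append([])
            continue
        sections[-1].append(line)
    # Pad so a missing analysis section comes back as an empty list.
    sections += [[] for _ in range(3 - len(sections))]
    header, body, analysis = sections[:3]
    return header, body, analysis


def parse_entry(line, fields):
    """Map one body line onto the given field names.

    Fields are separated by spaces or tabs; quoted fields keep their
    inner spaces. shlex quoting rules are an assumption of this sketch.
    """
    values = shlex.split(line)
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(values)}")
    return dict(zip(fields, values))
```

Passing the field order as a parameter keeps the sketch independent of the header grammar, which is still being finalized.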

Following the body of the logfile, a system may optionally record some statistics on the logging session. 
There is no specific format for the data in this analysis section, nor is there any limitation on the type of 
information that can be provided. The nature of the measures recorded depends wholly upon the device 
manufacturer. For example, our IMPACT software can be configured to record the total number of 
characters and words logged during a session, as well as estimates of communication rate and keystroke 
efficiency. 



A very brief logfile example is provided below. This example was recorded using a QWERTY keyboard 
supplemented by a 5-word prediction list accessed through the function keys (F1 through F5). Besides 
providing a timestamp and output information, this logfile records the source action and type of each 
selection, as well as information about the current context (useful for analyzing the effectiveness of word 
prediction). 



TIME       Absolute time
OUTPUT     Text output
TYPE       Type of selected element
ACTION     Selection action
CONTEXT    Local context
Time: 12:10:39 09/29/1999
$$$ End Header (and begin Body)
12:10:41.0   ""       Shift       key_shift   ""
12:10:42.7   "The "   List        key_f1      ""
12:10:43.8   "b"      Character   key_b       "The "
12:10:45.4   "est "   List        key_f3      "The b"
12:10:46.5   "t"      Character   key_t       "The best "
12:10:47.8   "h"      Character   key_h       "The best t"
12:10:49.2   "i"      Character   key_i       "The best th"
12:10:50.9   "ng "    List        key_f2      "The best thi"

$$$ End Body (and begin Analysis) 



Time: 12:10:53 09/29/1999 
Output: "The best thing " 

Characters: 15 
Words: 3 

Characters/word: 5.00 (4.00) 
Keystrokes/character: 0.47 
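
The figures in this Analysis section can be reproduced from the eight body entries with a short Python sketch. Two points are inferences and not statements from the format definition: the parenthesized 4.00 is taken to be characters per word with spaces excluded, and the 0.47 keystrokes per character is obtained by counting only the seven entries that produced output (that is, leaving out the Shift press). The function and variable names are hypothetical.

```python
# (output, action) pairs taken from the body of the example logfile.
EXAMPLE_ENTRIES = [
    ("", "key_shift"),
    ("The ", "key_f1"),
    ("b", "key_b"),
    ("est ", "key_f3"),
    ("t", "key_t"),
    ("h", "key_h"),
    ("i", "key_i"),
    ("ng ", "key_f2"),
]


def session_stats(entries):
    """Compute the summary measures shown in the example Analysis section."""
    text = "".join(output for output, _ in entries)
    words = text.split()
    if not words:
        raise ValueError("no words were logged")
    characters = len(text)
    # Assumption: only entries that produced output count as keystrokes.
    keystrokes = sum(1 for output, _ in entries if output)
    return {
        "output": text,
        "characters": characters,
        "words": len(words),
        "chars_per_word": characters / len(words),
        # Assumption: the parenthesized figure excludes spaces.
        "chars_per_word_no_spaces": sum(len(w) for w in words) / len(words),
        "keystrokes_per_char": keystrokes / characters,
    }
```

Counting all eight entries would instead give 8/15, or about 0.53, so the choice of which actions count as keystrokes should be stated explicitly whenever such a measure is reported.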



In developing the logfile format, we are actively seeking feedback from persons in the AAC community. 
A complete specification of the proposed format can be found at http://www.enkidu.net/logfile.html, 
along with instructions on how to suggest enhancements. The feedback period will continue until 
September 30, 2000, at which time the format will be fixed. 
AUGMENTATIVE COMMUNICATION QUANTITATIVE ANALYSIS (ACQUA) PACKAGE 

Many of the measures commonly used in augmentative communication cannot be easily derived using 
generic statistical analysis programs. For example, keystroke savings cannot be computed without 
additional information about the baseline keystroke count. While it would be possible to write programs 
to compute most measures using commercially available statistical packages, a dedicated program for 
computing AAC-specific statistics would facilitate logfile analysis. The existence of such a program 
would also provide additional incentive for manufacturers to adopt the proposed logfile format. 
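
To illustrate why the baseline matters, the sketch below computes keystroke savings as the percentage reduction in keystrokes relative to a baseline count. The formula is the conventional definition and is supplied here as an assumption, since this paper does not state one; the function name is hypothetical.

```python
def keystroke_savings(actual_keystrokes, baseline_keystrokes):
    """Percentage of keystrokes saved relative to a baseline count.

    The baseline (for example, one keystroke per character of output)
    is not recoverable from the text output alone and must be supplied.
    """
    if baseline_keystrokes <= 0:
        raise ValueError("baseline keystroke count must be positive")
    return 100.0 * (1.0 - actual_keystrokes / baseline_keystrokes)
```

For instance, producing 15 characters with 7 selections against a baseline of one keystroke per character gives a saving of roughly 53 percent; a different baseline convention would change that figure.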

We are developing a statistical analysis program that will provide a fast and convenient means to analyze 
logfile data. The Augmentative Communication Quantitative Analysis (ACQUA) package is being written 
for Microsoft Windows. Once completed, this program will be made freely available to the AAC 
community. Besides providing AAC-related statistics, ACQUA will allow operators to filter and reformat 
logfiles for export to popular commercial analysis packages such as SPSS. The program will also serve as 
a logfile viewing tool, allowing the operator to browse through recorded data. 

In defining a set of statistics and performance measures to be incorporated in ACQUA, we are identifying 
those in common AAC usage. These include measures of language usage (for example, average sentence 
length, average word length, raw number of sentences, and vocabulary distribution), derived measures of 
communication efficiency (for example, keystrokes per character and communication rate), and 
device-specific usage measures (for example, frequency of selection for specific keys). A comprehensive 
list of ACQUA statistics can be found at http://www.enkidu.net/acqua.html. As with the logfile format, 
we are actively seeking feedback from members of the AAC community regarding which statistical 
measurements should be included in ACQUA. 

Since a logfile may consist of many days' worth of communication data, ACQUA can be configured to 
analyze only specific sections of the data. The span of this data window can be defined in terms of the 
following parameters: 

• Elapsed time 

• Number of characters, words, or sentences 

• Number of logfile entries 

• Number of output entries 

• Number of discrete actions (selections) 

ACQUA can also be utilized to analyze a series of consecutive (or overlapping) data windows, providing 
a sliding estimate of the specified measures. This approach could be used, for example, to plot and 
analyze how communication rate changes with time. Such windowing can provide more specific 
information about the effectiveness of augmentative communication than can global (non-windowed) 
analysis. For example, a windowed measure of communication rate might reveal specific contexts in 
which an interface is particularly effective (or ineffective). 
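
A windowed measure of this kind can be sketched in Python as follows. This is not ACQUA code: the function name, the use of elapsed time in seconds as the window parameter, and characters per minute as the rate unit are all assumptions chosen for illustration. Setting the step smaller than the window yields overlapping windows.

```python
def windowed_rate(entries, window, step):
    """Sliding estimate of communication rate in characters per minute.

    entries: list of (time_in_seconds, output_text), sorted by time.
    window:  width of each data window in seconds.
    step:    offset between window starts in seconds.
    Returns a list of (window_start, chars_per_minute) pairs.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if not entries:
        return []
    first, last = entries[0][0], entries[-1][0]
    rates = []
    start = first
    while start <= last:
        # Count characters output within the half-open window [start, start + window).
        chars = sum(len(text) for t, text in entries
                    if start <= t < start + window)
        rates.append((start, chars * 60.0 / window))
        start += step
    return rates
```

Applied to the example logfile with times expressed relative to the first entry, a 5-second window with a 5-second step gives one rate for the first half of the session and another for the second half, which is the kind of change over time a global average would hide.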

SUMMARY 

We have defined the framework of a universal format for the continuous logging of augmentative 
communication. This format is flexible and powerful enough to satisfy the needs of researchers, clinicians, 
care-givers, and end-users. At the same time, it provides compatibility with simpler formats such as that 
used by the LAM. When combined with the ACQUA package for logfile analysis, this standard promises 
to open new possibilities for the quantitative assessment of AAC technologies. These empirical studies 
will in turn serve to guide future advancements and to enhance the communication experience for users of 
current technologies. 